Results 1 - 5 of 5
1.
PLoS Comput Biol ; 14(4): e1006097, 2018 04.
Article in English | MEDLINE | ID: mdl-29684010

ABSTRACT

Transposable elements (TEs) are repetitive nucleotide sequences that make up a large portion of eukaryotic genomes. They can move and duplicate within a genome, increasing genome size and contributing to genetic diversity within and across species. Accurate identification and classification of TEs present in a genome is an important step towards understanding their effects on genes and their role in genome evolution. We introduce TE-Learner, a framework based on machine learning that automatically identifies TEs in a given genome and assigns a classification to them. We present an implementation of our framework towards LTR retrotransposons, a particular type of TE characterized by having long terminal repeats (LTRs) at their boundaries. We evaluate the predictive performance of our framework on the well-annotated genomes of Drosophila melanogaster and Arabidopsis thaliana and we compare our results for three LTR retrotransposon superfamilies with the results of three widely used methods for TE identification or classification: RepeatMasker, Censor and LtrDigest. In contrast to these methods, TE-Learner is the first to incorporate machine learning techniques, outperforming these methods in terms of predictive performance, while being able to learn models and make predictions efficiently. Moreover, we show that our method was able to identify TEs that none of the above methods could find, and we investigated TE-Learner's predictions which did not correspond to an official annotation. It turns out that many of these predictions are in fact strongly homologous to a known TE.


Subjects
Machine Learning , Retroelements , Terminal Repeat Sequences , Animals , Arabidopsis/genetics , Arabidopsis Proteins/genetics , Computational Biology , Conserved Sequence , Plant DNA/genetics , Decision Trees , Drosophila Proteins/genetics , Drosophila melanogaster/genetics , Molecular Evolution , Insect Genome , Plant Genome , Software
2.
J Comput Biol ; 25(5): 517-527, 2018 05.
Article in English | MEDLINE | ID: mdl-29297699

ABSTRACT

Profile hidden Markov models (pHMMs) have been used to search for transposable elements (TEs) in genomes. When learning pHMMs intended to search for TEs of the retrotransposon class, the conventional protocol is to use the whole internal nucleotide portions of these elements as representative sequences. To further explore the potential of pHMMs in such a search, we propose five alternative ways, other than the conventional protocol, to obtain the sets of representative sequences of TEs. In this study, we are interested in the Bel-PAO, Copia, Gypsy, and DIRS superfamilies from the retrotransposon class. We compared the pHMMs of all six protocols. The test results show that, for each TE superfamily, the pHMMs of at least two of the proposed protocols performed better than the conventional one and that the number of correct predictions provided by the latter can be improved by considering together the results of one or more of the alternative protocols.


Subjects
Drosophila melanogaster/genetics , Genome , Markov Chains , Retroelements , Animals , Molecular Evolution
3.
Bioinformatics ; 31(11): 1836-8, 2015 Jun 01.
Article in English | MEDLINE | ID: mdl-25638811

ABSTRACT

Profile hidden Markov models (profile HMMs) are known to efficiently predict whether an amino acid (AA) sequence belongs to a specific protein family. Profile HMMs can also be used to search for protein domains in genome sequences. In this case, HMMs are typically learned from AA sequences and then used to search on the six-frame translation of nucleotide (NT) sequences. However, this approach demands additional processing of the original data and search results. Here, we propose an alternative and more direct method which converts an AA alignment into an NT one, after which an NT-based HMM is trained to be applied directly on a genome.
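
The two routes contrasted in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`six_frame`, `aa_alignment_to_nt`) are hypothetical, the standard genetic code is assumed, and the conversion step assumes that the ungapped coding nucleotide sequence of each aligned protein is available. The abstract does not say how the nucleotide sequences are obtained.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (assumption: no
# alternative translation tables are needed).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq: str) -> str:
    """Translate complete codons only; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame(seq: str) -> list[str]:
    """Conventional route: translate all six reading frames of an NT sequence."""
    seq = seq.upper()
    strands = (seq, reverse_complement(seq))
    return [translate(s[offset:]) for s in strands for offset in range(3)]


def aa_alignment_to_nt(aligned_aa: str, nt_seq: str) -> str:
    """Direct route: thread a coding NT sequence onto one gapped AA row.

    Every residue becomes its own codon and every gap '-' becomes '---',
    so the columns of the NT alignment stay codon-aligned.
    """
    residues = sum(1 for ch in aligned_aa if ch != "-")
    if len(nt_seq) != 3 * residues:
        raise ValueError("NT sequence length must be 3 x number of residues")
    out, pos = [], 0
    for ch in aligned_aa:
        if ch == "-":
            out.append("---")
        else:
            out.append(nt_seq[pos:pos + 3])
            pos += 3
    return "".join(out)


if __name__ == "__main__":
    print(six_frame("ATGAAA"))
    print(aa_alignment_to_nt("M-K", "ATGAAA"))
```

In this sketch, the rows produced by `aa_alignment_to_nt` would form the nucleotide alignment from which an NT-based profile HMM is trained, whereas `six_frame` shows the extra translation step that the conventional AA-based search requires for every genome sequence.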


Subjects
Genomics/methods , Sequence Alignment/methods , Protein Sequence Analysis/methods , Animals , Bacteria/enzymology , Bacteria/genetics , Fungal Proteins/chemistry , Fungal Proteins/genetics , Markov Chains , Phosphoric Monoester Hydrolases/chemistry , Phosphoric Monoester Hydrolases/genetics , Tertiary Protein Structure , Ribonuclease H/chemistry
4.
Genet. mol. biol ; 30(3,suppl): 965-971, 2007. graf, tab
Article in English | LILACS | ID: lil-467274

ABSTRACT

The Citrus ESTs Sequencing Project (CitEST) conducted at Centro APTA Citros Sylvio Moreira/IAC has identified and catalogued ESTs representing a set of citrus genes expressed under relevant stress responses, including diseases such as citrus variegated chlorosis (CVC), caused by Xylella fastidiosa. All sweet orange (Citrus sinensis L. Osb.) varieties are susceptible to X. fastidiosa. On the other hand, mandarins (C. reticulata Blanco) are considered tolerant or resistant to the disease: the bacterium can be sporadically detected within the trees, but no disease symptoms or economic losses are observed. To study their genetic responses to the presence of X. fastidiosa, we have compared EST libraries of leaf tissue of sweet orange Pêra IAC (a cultivar highly susceptible to X. fastidiosa) and mandarin ‘Ponkan’ (tolerant) artificially infected with the bacterium. Using an in silico differential display, 172 genes were found to be significantly differentially expressed in such conditions. Sweet orange presented an increase in expression of photosynthesis-related genes that could reveal a strategy to counterbalance a possible lower photosynthetic activity resulting from early effects of the bacterial colonization in affected plants. On the other hand, mandarin showed an active multi-component defense response against the bacterium similar to the non-host resistance pattern.

5.
Genet. mol. biol ; 28(3,suppl): 634-639, Nov. 2005. tab, graf
Article in English | LILACS | ID: lil-440445

ABSTRACT

Transposable elements (TE) are major components of eukaryotic genomes and are involved in cell regulation and organism evolution. We have analyzed 123,889 expressed sequence tags of the Eucalyptus Genome Project database and found 124 sequences representing 76 TE in 9 groups, of which the copia, MuDR and FAR1 groups were the most abundant. The low number of TE sequences may reflect the high efficiency of repression of these elements, a process that is called TE silencing. The frequency of TE groups in Eucalyptus libraries, which were prepared with different tissues or physiologic conditions from seedlings or adult plants, indicated that developing plants experience the expression of a much wider spectrum of TE groups than that seen in adult plants. These are preliminary results that identify the most relevant TE groups involved with Eucalyptus development, which is important for industrial wood production.


Subjects
DNA Transposable Elements , Eucalyptus/genetics , Plant Genome , Eukaryotic Cells , Expressed Sequence Tags